Multimodal Emotion Recognition


Multimodal emotion recognition is the task of inferring a subject's emotional state by combining complementary signals from multiple modalities, such as speech, text, and facial expressions.
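As a minimal sketch of how multiple modalities can be combined, the snippet below implements simple late fusion: each modality yields a probability distribution over emotion labels, and the distributions are averaged (optionally weighted) before picking the top label. The emotion set, per-modality scores, and equal weights are illustrative assumptions, not the method of any paper listed here.

```python
# Late-fusion sketch: average per-modality emotion probabilities.
# EMOTIONS and the example scores are illustrative placeholders,
# not outputs of a real model.

EMOTIONS = ["angry", "happy", "neutral", "sad"]

def fuse(scores_by_modality, weights=None):
    """Weighted average of per-modality probability vectors."""
    modalities = list(scores_by_modality)
    if weights is None:
        # Assume equal trust in every modality.
        weights = {m: 1.0 / len(modalities) for m in modalities}
    fused = [0.0] * len(EMOTIONS)
    for m in modalities:
        for i, p in enumerate(scores_by_modality[m]):
            fused[i] += weights[m] * p
    return fused

def predict(scores_by_modality, weights=None):
    """Return the emotion label with the highest fused score."""
    fused = fuse(scores_by_modality, weights)
    return EMOTIONS[max(range(len(EMOTIONS)), key=fused.__getitem__)]

# Example: the text alone is ambiguous, but prosody and the face
# both lean "happy", so fusion resolves the ambiguity.
scores = {
    "text":   [0.10, 0.30, 0.40, 0.20],
    "speech": [0.05, 0.60, 0.25, 0.10],
    "face":   [0.05, 0.70, 0.20, 0.05],
}
print(predict(scores))  # "happy"
```

Late fusion is only one option; many of the papers below instead fuse intermediate features or use graph/agent-based interactions between modalities.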

ATIR: Towards Audio-Text Interleaved Contextual Retrieval

Apr 22, 2026

AffectAgent: Collaborative Multi-Agent Reasoning for Retrieval-Augmented Multimodal Emotion Recognition

Apr 14, 2026

EEG-Based Multimodal Learning via Hyperbolic Mixture-of-Curvature Experts

Apr 14, 2026

Empowering Video Translation using Multimodal Large Language Models

Apr 13, 2026

Ambivalence/Hesitancy Recognition in Videos for Personalized Digital Health Interventions

Apr 14, 2026

SURE: Synergistic Uncertainty-aware Reasoning for Multimodal Emotion Recognition in Conversations

Apr 02, 2026

ActFER: Agentic Facial Expression Recognition via Active Tool-Augmented Visual Reasoning

Apr 10, 2026

Dual-branch Graph Domain Adaptation for Cross-scenario Multi-modal Emotion Recognition

Mar 27, 2026

MECO: A Multimodal Dataset for Emotion and Cognitive Understanding in Older Adults

Apr 03, 2026

Dynamic Fusion-Aware Graph Convolutional Neural Network for Multimodal Emotion Recognition in Conversations

Mar 22, 2026